Add adapt_checkpoint_hparams hook for customizing checkpoint hyperparameter loading #21408
Conversation
Add adapt_checkpoint_hparams hook for customizing checkpoint hyperparameter loading

Fixes Lightning-AI#21255

This commit adds the adapt_checkpoint_hparams() public method to LightningCLI, allowing users to customize hyperparameters loaded from checkpoints before they are used to instantiate model classes. This is particularly useful when using checkpoints from a TrainingModule with a different InferenceModule class that has different __init__ parameters.

Problem: When loading a checkpoint trained with TrainingModule(lr=1e-3) into an InferenceModule() that doesn't accept 'lr' as a parameter, the CLI would fail during instantiation because it tries to pass all checkpoint hyperparameters to the new module class.

Solution: Added an adapt_checkpoint_hparams() hook that is called in _parse_ckpt_path() after loading checkpoint hyperparameters but before applying them. Users can override this method to:
- Remove training-specific hyperparameters (e.g., lr, weight_decay)
- Modify _class_path for subclass mode
- Transform hyperparameter names/values
- Disable checkpoint hyperparameters entirely by returning {}

Example usage:

    class MyCLI(LightningCLI):
        def adapt_checkpoint_hparams(self, checkpoint_hparams):
            checkpoint_hparams.pop('lr', None)
            checkpoint_hparams.pop('weight_decay', None)
            return checkpoint_hparams

This approach is preferable to:
- Disabling checkpoint loading entirely (loses valuable hyperparameter info)
- Adding CLI arguments (deviates from the Trainer parameter pattern)
- Modifying private methods (breaks encapsulation)

The hook provides maximum flexibility while maintaining backward compatibility (the default implementation returns the hyperparameters unchanged).
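To make the scenario concrete, here is a minimal sketch of the two module classes and the CLI subclass described above. The layer sizes and defaults are illustrative, and the hook signature shown is the one from this commit (a subcommand parameter is added later in the review).

```python
import torch
from lightning.pytorch import LightningModule
from lightning.pytorch.cli import LightningCLI


class TrainingModule(LightningModule):
    """Saves lr/weight_decay into the checkpoint via save_hyperparameters()."""

    def __init__(self, lr: float = 1e-3, weight_decay: float = 0.0):
        super().__init__()
        self.save_hyperparameters()
        self.layer = torch.nn.Linear(32, 2)


class InferenceModule(LightningModule):
    """Accepts neither lr nor weight_decay, so raw checkpoint hparams would break instantiation."""

    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)


class MyCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, checkpoint_hparams):
        # Drop training-only keys before they are passed to InferenceModule.__init__
        checkpoint_hparams.pop("lr", None)
        checkpoint_hparams.pop("weight_decay", None)
        return checkpoint_hparams
```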
for more information, see https://pre-commit.ci
Pull request overview
This PR adds a public adapt_checkpoint_hparams() hook to LightningCLI that enables users to customize hyperparameters loaded from checkpoints before model instantiation. This addresses the issue of loading checkpoints across different module classes (e.g., from TrainingModule to InferenceModule) where incompatible __init__ parameters would otherwise cause failures.
Key Changes:
- Added adapt_checkpoint_hparams() public method with comprehensive documentation
- Integrated the hook into _parse_ckpt_path() to allow customization before hyperparameter application
- Maintained backward compatibility with a default no-op implementation
src/lightning/pytorch/cli.py
Outdated
    def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
        """Adapt checkpoint hyperparameters before instantiating the model class.

        This method allows for customization of hyperparameters loaded from a checkpoint when
        using a different model class than the one used for training. For example, when loading
        a checkpoint from a TrainingModule to use with an InferenceModule that has different
        ``__init__`` parameters, you can remove or modify incompatible hyperparameters.

        Args:
            checkpoint_hparams: Dictionary of hyperparameters loaded from the checkpoint.

        Returns:
            Dictionary of adapted hyperparameters to be used for model instantiation.

        Example::

            class MyCLI(LightningCLI):
                def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
                    # Remove training-specific hyperparameters not needed for inference
                    checkpoint_hparams.pop("lr", None)
                    checkpoint_hparams.pop("weight_decay", None)
                    return checkpoint_hparams

        Note:
            If subclass module mode is enabled and ``_class_path`` is present in the checkpoint
            hyperparameters, you may need to modify it as well to point to your new module class.
        """
        return checkpoint_hparams
Copilot AI (Dec 6, 2025)
The new adapt_checkpoint_hparams() hook lacks test coverage. Given that tests/tests_pytorch/test_cli.py contains comprehensive tests for checkpoint loading functionality (e.g., test_lightning_cli_ckpt_path_argument_hparams and test_lightning_cli_ckpt_path_argument_hparams_subclass_mode), tests should be added to verify:
- The hook is called when loading checkpoint hyperparameters
- Modifications made in the hook are applied correctly
- Returning an empty dict properly skips checkpoint hyperparameter loading
- The hook works in both regular and subclass modes
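Such a test could look roughly like the sketch below. The cleandir fixture and the checkpoint-glob pattern follow the existing CLI tests, while HParamOnlyModel, the test name, and the exact assertions are placeholders rather than the tests eventually added to this PR; the single-argument hook signature matches the code at this point in the review.

```python
from pathlib import Path
from unittest import mock

from lightning.pytorch.cli import LightningCLI
from lightning.pytorch.demos.boring_classes import BoringModel


class HParamOnlyModel(BoringModel):
    """Stores a hyperparameter that is not tied to any layer shape."""

    def __init__(self, out_dim: int = 2):
        super().__init__()
        self.save_hyperparameters()


def test_adapt_checkpoint_hparams_hook_pop_keys(cleandir):
    seen = {}

    class HookCLI(LightningCLI):
        def adapt_checkpoint_hparams(self, checkpoint_hparams):
            seen.update(checkpoint_hparams)          # record that the hook ran
            checkpoint_hparams.pop("out_dim", None)  # modification that should be applied
            return checkpoint_hparams

    # Create a checkpoint first, then reload it through --ckpt_path so the hook runs.
    with mock.patch("sys.argv", ["any.py", "fit", "--model.out_dim=3", "--trainer.max_epochs=1"]):
        cli = HookCLI(HParamOnlyModel)

    ckpt = str(next(Path(cli.trainer.log_dir, "checkpoints").glob("*.ckpt")))
    with mock.patch("sys.argv", ["any.py", "validate", f"--ckpt_path={ckpt}"]):
        cli = HookCLI(HParamOnlyModel)

    assert seen.get("out_dim") == 3        # the hook received the checkpoint hyperparameters
    assert cli.model.hparams.out_dim == 2  # the popped key fell back to the class default
```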
src/lightning/pytorch/cli.py
Outdated
        else:
            self.config = parser.parse_args(args)

    def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
Copilot AI (Dec 6, 2025)
Use lowercase dict instead of Dict for type annotations to align with the modern Python 3.9+ style used throughout this file. Change Dict[str, Any] to dict[str, Any] in both the parameter and return type annotations.
src/lightning/pytorch/cli.py
Outdated
        Example::

            class MyCLI(LightningCLI):
                def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
Copilot AI (Dec 6, 2025)
Use lowercase dict instead of Dict for type annotations to align with the modern Python 3.9+ style used throughout this file. Change Dict[str, Any] to dict[str, Any] in both the parameter and return type annotations.
mauvilsa
left a comment
It is looking good. However, the subcommand parameter is missing. Also please add unit tests.
src/lightning/pytorch/cli.py
Outdated
        else:
            self.config = parser.parse_args(args)

    def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
Suggested change:

-    def adapt_checkpoint_hparams(self, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
+    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: Dict[str, Any]) -> Dict[str, Any]:
As mentioned in my proposal, the method should receive a subcommand parameter.
src/lightning/pytorch/cli.py
Outdated
| checkpoint_hparams.pop("lr", None) | ||
| checkpoint_hparams.pop("weight_decay", None) |
In this example, removing lr and weight_decay should not be done if the subcommand is fit.
src/lightning/pytorch/cli.py
Outdated
            return

        # Allow customization of checkpoint hyperparameters via adapt_checkpoint_hparams hook
        hparams = self.adapt_checkpoint_hparams(hparams)
Suggested change:

-        hparams = self.adapt_checkpoint_hparams(hparams)
+        hparams = self.adapt_checkpoint_hparams(subcommand, hparams)
…ook and add tests

- Update adapt_checkpoint_hparams signature to include a subcommand parameter, allowing context-aware customization of checkpoint hyperparameters
- Change type annotations to use lowercase dict (Python 3.9+ style)
- Update docstring with subcommand parameter documentation
- Add example showing conditional logic based on subcommand
- Add comprehensive unit tests:
  - test_adapt_checkpoint_hparams_hook: tests that the hook is called and modifications are applied
  - test_adapt_checkpoint_hparams_hook_empty_dict: tests disabling checkpoint hparams loading
  - Tests cover both regular and subclass modes
for more information, see https://pre-commit.ci
Thanks for the response. I already updated the hook signature as suggested. Also added:

    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        if subcommand != "fit":
            checkpoint_hparams.pop("lr", None)  # Remove training params for inference
        return checkpoint_hparams

I also included 2 comprehensive tests: test_adapt_checkpoint_hparams_hook and test_adapt_checkpoint_hparams_hook_empty_dict, covering both regular and subclass modes.
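For the second of those tests, the CLI subclass can be very small. A sketch assuming the updated two-argument signature; the AdaptHparamsEmptyCLI name appears later in this thread, and the body here is illustrative:

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class AdaptHparamsEmptyCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # Returning an empty dict disables checkpoint hyperparameters entirely
        return {}
```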
- Split method signature across multiple lines to stay within the 120 character limit
- Improves code readability in the documentation example
mauvilsa
left a comment
It is looking good. But the two tests fail. You will need to implement a new Model class for these tests.
tests/tests_pytorch/test_cli.py
Outdated
    assert cli.model.layer.out_features == 4


def test_adapt_checkpoint_hparams_hook(cleandir):
Suggested change:

-def test_adapt_checkpoint_hparams_hook(cleandir):
+def test_adapt_checkpoint_hparams_hook_pop_keys(cleandir):
tests/tests_pytorch/test_cli.py
Outdated
    def add_arguments_to_parser(self, parser):
        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
Suggested change (remove these lines):

-    def add_arguments_to_parser(self, parser):
-        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
Linking of arguments is not relevant to test this hook. Better to not have it to avoid distraction.
tests/tests_pytorch/test_cli.py
Outdated
    def add_arguments_to_parser(self, parser):
        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
Suggested change (remove these lines):

-    def add_arguments_to_parser(self, parser):
-        parser.link_arguments("model.out_dim", "model.hidden_dim", compute_fn=lambda x: x * 2)
Linking of arguments is not relevant to test this hook. Better to not have it to avoid distraction.
tests/tests_pytorch/test_cli.py
Outdated
    # First, create a checkpoint
    cli_args = ["fit", "--model.out_dim=3", "--trainer.max_epochs=1"]
    with mock.patch("sys.argv", ["any.py"] + cli_args):
        cli = AdaptHparamsEmptyCLI(BoringCkptPathModel)
The test fails because BoringCkptPathModel has a module torch.nn.Linear(32, out_dim). If out_dim is changed, there is a tensor size mismatch.
Instead of using BoringCkptPathModel, implement a new class for these two tests that just sets an attribute which can be asserted after instantiation.
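A model along these lines could look like the sketch below. The class name is illustrative; hidden_dim defaulting to 16 follows the later discussion in this thread, and the out_dim default is an assumption.

```python
from lightning.pytorch.demos.boring_classes import BoringModel


class HParamsOnlyModel(BoringModel):
    """Only records its hyperparameters; no layer depends on them, so no tensor size mismatch."""

    def __init__(self, out_dim: int = 2, hidden_dim: int = 16):
        super().__init__()
        self.save_hyperparameters()
```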
… size mismatch in tests
for more information, see https://pre-commit.ci
@mauvilsa Thanks for the detailed feedback! I've implemented your suggestion. You correctly identified that the tests were failing due to tensor size mismatches. The original tests used BoringCkptPathModel; I created a new, simple model class specifically for testing the hook.
Tests Updated
mauvilsa
left a comment
@arrdel your tests still fail. It would be better if you run the tests locally and make sure everything works correctly before pushing. There are instructions on how to do that. More or less how I do it on Linux is (I don't remember exactly):
- Create a virtual env
- Install lightning like: export PACKAGE_NAME=lightning ; pip install -e ".[test]"
- Install jsonargparse like: pip install "jsonargparse[signatures]"
Then to run only the CLI tests, I do:
export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
pytest -v tests/tests_pytorch/test_cli.py
            hyperparameters, you may need to modify it as well to point to your new module class.
        """
        return checkpoint_hparams
Not really related to this new feature, but there is also my comment in #21116 (comment). Nobody responded to it. Maybe by default fit should not use the hparams from the checkpoint?
Also, this could be related to #21255 (comment).
I am not really sure what to do here.
@arrdel any comment on this?
Actually, it seems #21455 would fix this comment, I think.
Removed redundant method implementations since BoringModel provides them.
@mauvilsa Thanks for the feedback! I've made the fix you suggested:
The test was asserting hidden_dim==3 but only passing out_dim=3. Since hidden_dim defaults to 16 and there's no argument linking, the assertion failed. Now we explicitly pass --model.hidden_dim=6.
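In sketch form, the corrected setup passes the value explicitly instead of relying on a default or a linked argument. The CLI and model class names below are placeholders and the numbers are illustrative, asserting against the explicitly passed value:

```python
from unittest import mock

cli_args = ["fit", "--model.out_dim=3", "--model.hidden_dim=6", "--trainer.max_epochs=1"]
with mock.patch("sys.argv", ["any.py"] + cli_args):
    cli = AdaptHparamsCLI(HParamsOnlyModel)

assert cli.model.hparams.hidden_dim == 6  # asserts against the explicitly passed value
```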
Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@           Coverage Diff            @@
##           master   #21408    +/-   ##
=========================================
- Coverage      87%      79%      -8%
=========================================
  Files         269      267       -2
  Lines       23804    24009     +205
=========================================
- Hits        20626    18957    -1669
- Misses       3178     5052    +1874
mauvilsa
left a comment
Great to see that the tests are now successful. I still have these two comments, but overall it looks good, so I approve now. Anyway, my approval is not that useful, since someone from the Lightning team still needs to approve.
| checkpoint_hparams.pop("out_dim", None) | ||
| checkpoint_hparams.pop("hidden_dim", None) |
From a testing perspective there is no difference between out_dim and hidden_dim. It might be better if one is popped and the other not, so that both cases are tested?
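Implemented as suggested, the test's hook would pop only one of the two keys, roughly as in this sketch (the class name is a placeholder; the two-argument signature from earlier in the thread is assumed):

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class PopOneKeyCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # Pop only one key so the test exercises both the removed and the retained case
        checkpoint_hparams.pop("out_dim", None)
        return checkpoint_hparams
```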
            hyperparameters, you may need to modify it as well to point to your new module class.
        """
        return checkpoint_hparams
@arrdel any comment on this?
For some reason I am unable to resolve my old comments that have been addressed already.
What does this PR do?
Fixes #21255
This PR adds a public adapt_checkpoint_hparams() hook to LightningCLI that allows users to customize hyperparameters loaded from checkpoints before they are used to instantiate model classes. This solves the problem of loading checkpoints across different module classes (e.g., from TrainingModule to InferenceModule).

Problem
When using LightningCLI with checkpoints, hyperparameters saved during training are automatically loaded and applied when running other subcommands (test, predict, etc.). This is convenient when using the same module class, but fails when using a different class with incompatible __init__ parameters.

Example scenario:
Running cli predict --ckpt_path checkpoint.ckpt with InferenceModule fails because the CLI tries to pass lr=1e-3 from the checkpoint to InferenceModule.__init__().

Solution
Added an adapt_checkpoint_hparams() public method that users can override to customize loaded hyperparameters, as in the sketch below:
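A sketch of such an override, consistent with the examples discussed earlier in this thread (the class name and the specific keys removed are illustrative):

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class MyCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        if subcommand != "fit":
            # Remove training-only parameters before they reach the inference module
            checkpoint_hparams.pop("lr", None)
            checkpoint_hparams.pop("weight_decay", None)
        return checkpoint_hparams
```

Implementation Details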
- Added the adapt_checkpoint_hparams() public method in LightningCLI
- Modified _parse_ckpt_path() to call the hook after loading but before applying hyperparameters

Why This Approach?
As discussed in #21255, this is superior to the alternatives (disabling checkpoint loading entirely, adding CLI arguments, or modifying private methods; see the commit message above).

Testing
The implementation:

- Allows _class_path modification when needed

Example Use Cases

The following cases are illustrated by the sketches after this list:

- Remove training-only parameters
- Change module class in subclass mode
- Disable all checkpoint hyperparameters
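Hedged sketches of the three use cases, assembled from the examples discussed in this thread; the CLI class names and the module path assigned to _class_path are illustrative:

```python
from typing import Any

from lightning.pytorch.cli import LightningCLI


class RemoveTrainingParamsCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        if subcommand != "fit":
            checkpoint_hparams.pop("lr", None)
            checkpoint_hparams.pop("weight_decay", None)
        return checkpoint_hparams


class SwapClassCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # In subclass mode, point _class_path at the module class used for this run
        if subcommand == "predict":
            checkpoint_hparams["_class_path"] = "my_project.modules.InferenceModule"
        return checkpoint_hparams


class NoCheckpointHparamsCLI(LightningCLI):
    def adapt_checkpoint_hparams(self, subcommand: str, checkpoint_hparams: dict[str, Any]) -> dict[str, Any]:
        # Returning an empty dict disables checkpoint hyperparameters entirely
        return {}
```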
Does your PR introduce any breaking changes?
No, this is a purely additive change. The default implementation returns hyperparameters unchanged, preserving existing behavior.
Before submitting
PR review
cc: @mauvilsa @ziw-liu
📚 Documentation preview 📚: https://pytorch-lightning--21408.org.readthedocs.build/en/21408/